Abstract: Like death and taxes, Equipment Protection System (EPS) failures (alarms) are
undesirable, unpredictable, unwarranted, and unexplainable. 95% of industrial[1] alarms
are false, 99% in aircraft, and 99.9% in security systems. They excite management,
incite environmentalists, spur regulatory agencies, and frighten many. Product loss and
wasted resources are obvious consequences. Not so obvious is the negative impact on
safety. Fail-safe... shutdown upon failure... doesn't make the situation safer. Instead,
there's a high risk of damage or catastrophe during restart.

Technological Advance Inadequate: Pneumatics and relays were used initially. The
'50s saw solid-state, PLCs emerged in the '70s, and PCs in the '90s. This trend is a
Paradigm-Shift... the unquestioned use of habitually employed hi-tech. Computers have
improved, but I/O devices haven't... performance is still poor. This paradox is evident
in the Process industry... advance is evident, but false-trip rates haven't decreased.

Failures Are Predictable: Overt failures are simple and obvious. Covert ones aren't.
There are just two types: Electrical (opens, shorts, grounds, corrosion, maloperation,
etc.) and Transients (intermittents, glitches, vibration, power disturbances, etc.).

The Good, The Bad, and The Ugly: Reliability equations use a 2-state analysis, that
is, components are either GOOD or NO-GOOD. But the results of this technique can
be in error. In reality there are 3 states: GOOD (successful); BAD (unwarranted); and
UGLY (unresponsive), which require a 3-state analysis.


Probabilistic  Engineering  Techniques  to  Reduce  False  Alarms: 

•  The  anatomy  of  a  Protection-System  is  presented  and  its  parameters  evaluated. 

•  2-state  and  3-state  analyses  are  compared. 

•  Sacred-cows  are  exposed,  including  the  popular  software-based  TMR  scheme. 

• Mathematical models considering Good, Bad, & Ugly failure-modes are developed.

•  Examples  of  the  technique  are  illustrated. 

The Anatomy of an EPS System: Fig. 1 identifies major elements of an EPS system.


Figure  1:  ESD  Block  Diagram. 

Essentially  an  EPS  consists  of  seven  major  components: 

•  Input,  which  are  the  tripping-variable  sensors,  their  connecting  field  wiring  and 
their  terminations. 

•  Logic,  which  receives,  processes,  and  then  executes  the  pre-programmed  logic.  It 
may  also  include  analysis  of  the  final  element's  response  (feedback). 

• Output, which are the final output-action actuating devices (valves, breakers, etc.),
their wiring, and their terminations.

•  Operator  Control,  which  provides  the  operator  with  the  means  to  bypass,  trip,  reset, 
arm,  etc. 

•  Operator  Display,  which  provides  concise  information  regarding  EPS  status. 

•  Data  Acquisition,  which  sequentially  tags  and  time-stamps  the  EPS  events. 

•  Power  Supply,  which  provides  both  the  logic-level  and  output-action  power. 


2-State Probabilistic Engineering Analysis: Fig. 2 illustrates that with a 2-state
analysis (sensor is Good or NoGood) the probability[5] of system success increases with
the addition of redundant elements, which is the common perception of voting configurations:


Redundancy (Failure-Mode Excluded):

                  pswc    qswc    Psys    Qsys
Non-Redundant     0.70    0.30    0.70    0.30
Redundant         0.70    0.30    0.91    0.09

Where:

pswc = the probability of a sensor being GOOD (successful).
qswc = 1 - pswc = the probability of a sensor being NO-GOOD (unsuccessful).
Psys = the probability of system success.
Qsys = the probability of system failure.

Figure 2: Two-state analysis.


The sensor success and failure probabilities are pswc and qswc, respectively. In this
example Psys increases to 0.91, a formidable improvement over the single-switch case.
Fig. 2 illustrates that in a 2-state analysis[6] the probability of system success does
increase with redundancy.


The probability of success formula has been modified (see Appendix) to include
a sensor's fault-modes, overt and covert. The fault-mode probabilities are qo and
qc, respectively, where:

pswc = 0.7 = the probability of a sensor being GOOD (successful).
qo = 0.1 = the probability of the sensor being BAD (unwarranted).
qc = 0.2 = the probability of the sensor being UGLY (unresponsive).

Applying these values to the first example, the probability of system success becomes
Psys = 0.77, which is considerably poorer than the expected 0.91. And, if qo and qc are
interchanged, the situation worsens... Psys is further reduced to 0.63. In summary, when
fault-mode is included, an EPS relying on redundant switches does not automatically
result in higher reliability.

Redundancy (Failure-Mode Considered): qswc = qo + qc

                     qo     qc     Psys    Qsys
As given             0.1    0.2    0.77    0.23
Reverse qo, qc       0.2    0.1    0.63    0.37

Figure 3: Three-state analysis of a switch system with overt and covert failures
considered.
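
The arithmetic behind Figs. 2 and 3 can be checked with a few lines of Python. This is a minimal sketch, assuming two identical, independent switches in parallel and the parallel-network form given in the Appendix; the function names are hypothetical and are not taken from the paper.

```python
def two_state_parallel(q_swc: float, n: int) -> float:
    """2-state success of n parallel sensors: fails only if all are NO-GOOD."""
    return 1.0 - q_swc ** n


def three_state_parallel(q_o: float, q_c: float, n: int) -> float:
    """3-state success of n parallel sensors.

    Assumed form (Appendix, parallel network): success requires that no
    sensor fails overtly, less the case where every sensor fails covertly.
    """
    return (1.0 - q_o) ** n - q_c ** n


if __name__ == "__main__":
    print(f"{two_state_parallel(0.3, 2):.2f}")         # 0.91
    print(f"{three_state_parallel(0.1, 0.2, 2):.2f}")  # 0.77
    print(f"{three_state_parallel(0.2, 0.1, 2):.2f}")  # 0.63
```

The same total sensor failure probability (0.3) gives three different system results, depending only on whether fault-mode is considered and how it splits between overt and covert.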


Design  Factors  Affecting  EPS  Performance:  Other  design  factors  affect  system 
performance  and  therefore  must  be  considered.  They  include  transients,  connecting  wire 
failures,  termination  failures,  control  power  source  grounding  preference,  and  trip-mode 
philosophy. 

Figure 4: Factors affecting ESD performance.


Following  is  a  description  of  each  of  the  factors  noted  above: 

•  Transients  include  contact  bounce,  relay  chatter,  power  dips,  radio  frequency 
interference  (RFI),  electromagnetic  interference  (EMI),  X-ray  effects,  etc. 

•  Control-power  source  grounding  refers  to  whether  it  is  intentionally  grounded  or 
floating.  Alternating-current  control-power  sources  are  usually  grounded,  while 
Direct-current  control-power  sources  can  be  either  grounded  or  floating. 

•  Trip-mode  philosophy  refers  to  the  EPS  response-mode:  Energize-to-Trip  or 
ETT,  sometimes  referred  to  as  production-safe;  and  Deenergize-to-Trip  or  DTT, 
often  referred  to  as  fail-safe. 

•  Connecting  wire  failures  consist  of  open-circuits,  short-circuits,  and  ground- 
faults. 

• Termination failures include corrosion or accidental bridging of wire-strands at
adjacent terminals.

Many  plants  with  machinery  EPS  (power  plants  in  particular)  use  the  ETT  philosophy 
because  of  the  perception  that  ETT  is  blind  to  overt  or  false-trip  failure-modes. 
Conversely,  boiler  and  furnace  flame-guard  systems  have  DTT  as  required  by  National 
Fire  Protection  Association  (NFPA)  Standards  since  DTT  is  better  able  to  prevent  covert 
or  catastrophic  failure-modes.  An  explanation  of  each  type  follows. 


Example of a Typical ETT Input Loop: Consider the circuit shown in Fig. 5. A
normally-open sensor is powered by a floating power-source. This circuit must close
to energize the trip relay, which in turn actuates the final element, such as a valve,
breaker, etc. Transients like power dips or bounce will not cause false-alarms. On the
other hand, if an open circuit occurs in the sensor's connecting wires or its protective
fuse operates prematurely, the EPS will not trip when required, causing a catastrophic
failure.


Example  of  a  Typical  DTT  Input  Loop:  Now  consider  the  circuit  shown  in  Fig.  6. 
A  normally-closed  sensor  is  powered  by  a  grounded  power-source.  This  circuit  must 
open  to  deenergize  the  final  element.  The  system  will  fail  catastrophically  if  the  sensor, 
its  connecting  wires  or  its  terminations  are  short-circuited  and  none  of  the  faults  are 
detected.  Furthermore,  any  open-circuit,  ground-fault,  contact-bounce,  premature  fuse 
operation,  or  power  dip  will  result  in  a  false-trip. 

Figure 6: Deenergize-to-Trip (DTT) loop.


Thus, Fig. 5 is better for reducing effects of faults resulting in false-alarms, but it masks
those failures which could result in catastrophic failure. Conversely, Fig. 6 has opposite
characteristics. Each method has features useful in mitigating particular consequences.
ETT is more widely used in machine EPS where inadvertent loss of continuity of the
machine is intolerable. Conversely, DTT finds use in fired-equipment such as boilers,
furnaces and heaters, where the consequences of failure to trip could be disastrous.
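
The contrast between the two loops can be summarized as a small lookup table. The sketch below only encodes the fault consequences stated in the two examples above; the dictionary layout, the key names, and the "not stated" fallback are illustrative assumptions.

```python
# Consequence of each undetected fault, as described for the ETT (Fig. 5)
# and DTT (Fig. 6) input loops. Faults the text does not discuss are left out.
CONSEQUENCE = {
    "ETT": {
        "power_dip": "none",
        "contact_bounce": "none",
        "open_circuit": "catastrophic",
        "premature_fuse_operation": "catastrophic",
    },
    "DTT": {
        "short_circuit": "catastrophic",
        "open_circuit": "false_trip",
        "ground_fault": "false_trip",
        "contact_bounce": "false_trip",
        "premature_fuse_operation": "false_trip",
        "power_dip": "false_trip",
    },
}


def consequence(trip_mode: str, fault: str) -> str:
    """Return the stated consequence, or 'not stated' if the text is silent."""
    return CONSEQUENCE[trip_mode].get(fault, "not stated")
```

For example, `consequence("ETT", "open_circuit")` returns `"catastrophic"`, while the same fault in a DTT loop returns `"false_trip"`.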


3-State Analysis (w/o Diagnostics) for Machinery: Data obtained from IEEE[7],
AICE[8], MIL-Handbook[9], and US Navy Handbook[10] were used to develop Table N° 1.
Five system configurations are considered, all based on DC relay logic, ETT trip-mode,
and a floating control-power source, with fault-mode detection excluded:


Table N° 1: Energize-To-Trip System Performance.

LOGIC     Pswc       Psys       Qftr       Qcat
1 of 1    0.95740    0.88978    0.01927    [illegible]
1 of 2    0.97439    0.95356    0.03817    0.00827
2 of 2    0.94041    0.82601    0.00037    0.17362
2 of 3    0.99686    0.97559    0.00110    0.02331
2 of 4    0.99898    0.99502    0.00217    0.00280

Where:

1 of 1 = 1/1, or 1oo1, or One-of-One.

1 of 2 = 1/2, or 1oo2, or One-of-Two.

2 of 2 = 2/2, or 2oo2, or Two-of-Two.

2 of 3 = 2/3, or 2oo3, or Two-of-Three (TMR).

2 of 4 = 2/4, or 2oo4, or Two-of-Four (Nuclear).

Pswc = sensor success probabilities. These are calculated using the 2-state technique,
which considers only that the sensor is Good or NoGood and ignores the effects of
fault-mode.

Psys = system success probabilities. These are calculated with the 3-state method and
include fault-mode effects of the sensor, interconnecting wiring, terminations,
control-power source grounding preference, and trip-mode philosophy.

Qftr = false-trip failure probabilities.

Qcat = catastrophic failure probabilities.

3-State Analysis (w/o Diagnostics) for Fired-Equipment: To illustrate that different
results will be obtained for boiler or heater EPSs, the same failure-rates and
configurations are used to develop Table N° 2. This time, however, design factors are
based on AC relay logic, the DTT trip-mode philosophy, and a grounded power source:


Table N° 2: Deenergize-To-Trip System Performance.

LOGIC     Pswc           Psys           Qftr           Qcat
1 of 1    0.97227        0.88949        0.08306        0.02745
1 of 2    0.97488        0.84002        0.15922        0.00075
2 of 2    [illegible]    [illegible]    [illegible]    0.05415
2 of 3    0.99885        0.97823        0.01955        0.00222
2 of 4    [illegible]    0.96297        0.03695        0.00008
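
The voting rows of Tables N° 1 and N° 2 can be approximated from the single-sensor (1 of 1) row with a binomial count. The sketch below is an illustration under stated assumptions, not a method documented in the paper: sensors are taken as identical and independent, a k-of-n system false-trips when at least k sensors fail overtly, and it fails catastrophically when at least n-k+1 sensors fail covertly. The function names are hypothetical. With the Table N° 2 values for 1 of 1 (Qftr = 0.08306, Qcat = 0.02745) as inputs, the results agree with the tabulated 1 of 2, 2 of 3, and 2 of 4 false-trip and catastrophic entries to within rounding.

```python
from math import comb


def at_least(k: int, n: int, q: float) -> float:
    """Probability that at least k of n independent sensors are in a state
    that each sensor occupies with probability q."""
    return sum(comb(n, j) * q**j * (1.0 - q) ** (n - j) for j in range(k, n + 1))


def voting_performance(k: int, n: int, q_o: float, q_c: float):
    """Return (Psys, Qftr, Qcat) for a k-of-n voting arrangement.

    Assumptions: overt and covert counts are evaluated separately, and
    Psys is whatever probability remains after both failure modes.
    """
    q_ftr = at_least(k, n, q_o)          # enough unwarranted votes to trip
    q_cat = at_least(n - k + 1, n, q_c)  # too few responsive sensors left
    return 1.0 - q_ftr - q_cat, q_ftr, q_cat


if __name__ == "__main__":
    for k, n in [(1, 1), (1, 2), (2, 2), (2, 3), (2, 4)]:
        p_sys, q_ftr, q_cat = voting_performance(k, n, 0.08306, 0.02745)
        print(f"{k} of {n}: Psys={p_sys:.5f} Qftr={q_ftr:.5f} Qcat={q_cat:.5f}")
```

The pattern is the one the tables show: adding parallel sensors (1 of 2) trades a lower Qcat for a higher Qftr, unanimous voting (2 of 2) does the reverse, and majority voting reduces both.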

Impact  Of  Fault-mode  Detection  On  EPS  Performance:  Will  fault-mode  detection 
still  satisfy  the  EPS  requirements  without  compromising  safe  and  continuous  operations? 
Different  philosophies  were  a  reasonable  course  of  action  to  meet  the  unique  and 
divergent  operational  requirements  of  machinery  and  fired-equipment  shutdown  systems. 
However,  with  the  advent  of  fault-mode  detection  and  the  application  of  3-state  analysis, 
now  both  applications  can  be  served  with  one  type  of  EPS.  A  major  advantage  is  that 
operator  and  technician  understanding  of  the  system  is  simplified.  They  do  not  need  to 
wear  two  hats,  one  for  machinery,  the  other  for  fired-equipment  protection  systems. 

The same failure-rates which were used to develop Tables N° 1 and N° 2 were used in the
development of the probability values shown in Table N° 3, except that failure-rates were
adjusted to exclude all detectable failures. Four of the protection system types evaluated
earlier were evaluated again. A fifth type, HIQ, was also evaluated. It exceeds 2 of 3 or TMR
expectations for half the cost. Following is a list of the 5 configurations, based on
solid-state discrete-logic, the DTT philosophy, a grounded control-power source, and
with fault-mode detection and diagnosis included:


Table  N°  3:  Fault-Detection  System  Performance. 


LOGIC    Pswc       Psys       Qftr           Qcat
SNV      0.97227    0.96472    0.02013        0.01514
DNV      0.97488    0.95991    0.03986        0.00023
DUV      0.96967    0.96969    [illegible]    0.03027
TMV      0.99885    0.99919    0.00012        [illegible]
HIQ      0.99892    0.99937    0.00016        0.00047

Where: 

SNV  =  Simplex-Non-Voting. 

DNV  =  Duplex-Non-Voting. 

DUV  =  Duplex-Unanimous-Voting. 

TMV  =  Triplex-Majority-Voting. 

HIQ  =  High-Integrity-Quad  (2x1/2)  Voting. 

This example indicates that the probability of system success, Psys, of an EPS having
fault-mode detection shows considerable improvement over the earlier examples which
do not have fault-mode detection. Correspondingly, the False Trip-rate Probability, Qftr,
and the Catastrophic Failure Probability, Qcat, are substantially reduced.


3-State  Analysis  On  A  Large  Turbo-Machine:  This  case  illustrates  the  application 
of  the  3-state  analysis  to  an  FCCU  machinery-train  located  in  a  Gulf-states  area 
refinery.  The  goal...  to  achieve  a  four-year  continuous  run.  Obviously,  false-trip 
reduction  of  the  EPS  was  an  essential  consideration.  The  original  system  is  compared 
to  one  having  fault-mode  detection  and  diagnostics.  Failure-rate  data  shown  is  specific 
to  the  trip  variable's  input-sensor  type: 


CASE STUDY:

EPS PERFORMANCE STUDY COVERING GULF-STATES-AREA REFINERY FCCU MACHINERY-TRAIN

Scope of Study: Original Relay System vs Fault-Detection Equipped EPS System
Basis of Study:

Base Case: AC Relays; ETT Logic; Grounded-Power; w/o Fault-Mode Detection
Alt' Case: Solid-State; DTT Logic; Grounded-Power; with Fault-Mode Detection

                           ORIGINAL (Base Case)                 FAULT DETECTION EPS
INPUT SENSOR               Logic  Psys     Qftr     Qcat        Logic  Psys     Qftr     Qcat

PROCESS VARIABLES
  Regen'r Pressure         1/1    0.88970  0.01150  0.09880     HIQ    0.99981  0.00001  0.00018
  Lube Oil Pressure        1/1    0.88970  0.01150  0.09880     HIQ    0.99981  0.00001  0.00018
  Separator Inlet Temp     1/1    0.89887  0.01530  0.08583     HIQ    0.99981  0.00015  0.00004
  Instr Air Pressure       1/1    0.88970  0.01150  0.09880     DUV    0.98642  0.00001  0.01357
AIR BLOWER VARIABLES
  Axial Displacement       1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Gear End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Xpnd End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
EXPANDER VARIABLES
  Axial Displacement       1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Cplg End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Idle End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Overspeed                1/2    0.95967  0.03188  0.00846     DUV    0.99006  0.00003  0.00991
BULL GEAR VARIABLES
  Rad Vibr'n, Cplg End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Idle End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
PINION GEAR VARIABLES
  Axial Displacement       1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Gear End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Rad Vibr'n, Xpnd End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
MOTOR/GEN'R VARIABLES
  Rad Vibr'n, Cplg End     1/2    0.93958  0.05126  0.00916     DUV    0.98770  0.00008  0.01222
  Motor Run Contact        1/1    0.87932  0.01916  0.10152     SNV    0.98471  0.00004  0.01525

AVERAGE VALUES                    0.92677  [remaining values illegible]
EQUIV FAILURE-RATE                [illegible]

DEFINITIONS:

Psys = The probability of System success.

Qftr = The probability of System false-trips.

Qcat = The probability of System catastrophic risk.

PERFORMANCE IMPROVEMENT FACTORS (Original vs Alternative; values illegible):

OIF, Overall Improvement Factor: The ratio of the Original-system equivalent failure-rate
to the Alternative-system equivalent failure-rate.

FTR, False-Trip Reduction Factor: The ratio of the Original-system average false-trip
failure-rate, Qftr, to the Alternative-system average false-trip failure-rate.

CRF, Catastrophic Risk Factor: The ratio of the Alternative-system average catastrophic
failure-rate, Qcat, to the Original-system average catastrophic failure-rate.

BCR, Benefit-To-Cost Ratio: The ratio of the Overall-Improvement-Factor achieved, to the
cost (normalized) paid for the improvement.


Result: This unit not only achieved its original four-year goal (1,400 days), but thus
far has reached 6,800 days, more than 18 years, without failure.


The  Sacred  Cows  of  EPS  Design:  When  it  comes  to  design  of  EPS,  the  only  ones  who 
can  change  rules  are  people  most  involved  and  responsible...  engineers  who  design  them. 


"Sacred  cows"  are  the  untouchable  designs,  specifications,  etc.,  that  EPS  designers 
cherish most. They are easy to recognize... anything that is vehemently defended with
"we've always done it this way." Some of the most recognizable are:

•  Triplicate  Modular  Redundancy  (TMR). 

•  Personal  Bias  Establishes  Trip-Mode  Philosophy. 

• Designers Disregard Operators' Input[11].

•  Floating  Control-Power  Sources  Improve  Probability  of  success. 

•  Uninterruptible  Power  Systems  (UPS)  Eliminate  False-alarms. 

•  Flame-Scanners  and  Vibration  Monitors  Are  Notorious  for  Causing  False-alarms. 

What  About  This  TMR  "Stuff"  Anyway:  Imagine  that  in  1994  you  bought  a  new 
vehicle  and  to  keep  the  engine  in  good  working  order  you  perform  monthly  engine  tune- 
ups.  In  1997,  you  drive  that  car  to  the  JOAP  Conference  500  miles  away.  Would  you 
expect  the  vehicle  to  perform  exactly  as  it  did  in  1994?  No,  because  in  order  for  the  car 
to  successfully  complete  the  trip,  the  tires,  shocks,  brakes,  etc.,  also  need  to  be 
maintained  in  addition  to  the  monthly  tune-up. 

TMR, or Triplicate-Modular-Redundancy, is the most touted of EPS configurations. The
fallacy lies in how its performance is measured. MTBF is a valid means of evaluation,
but it falls short when used as a measure of TMR performance. An MTBF which ignores
the effects of input and output device failure probabilities is a misleading indicator of
performance. When the time-dependency effect is considered not only for its computer
(in the example, the car's engine) but also for its input/output components (the car's
tires, shocks, etc.), a surprisingly different performance picture emerges, as shown below:


Figure  8:  Comparison  of  1/1,  TMR, 
SNV,  and  HIQ. 


Fig. 7 illustrates the fallacy of the TMR system by comparing it to the 1/1 system over
a mission or service time of one year. Input device failure probability is included. Fig.
7 shows that the probability of success of the TMR is better than that of 1/1 until the
cross-over point at eight months (t = 0.693). The TMR system declines after this point.
If the 1/1 system sensor is maintained yearly, then TMR's three sensors must be
maintained at eight-month intervals in order for it to maintain its lead in probability of
success. These additional considerations negatively impact TMR's Overall-
Improvement-Factor. Its Benefit-to-Cost ratio is also reduced because the maintenance
multiplier is 4⅓ (3 ÷ 0.693) times that of 1/1.
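
The cross-over value t = 0.693 is ln 2, which is what the classic 2-state comparison of a single device with a 2-of-3 arrangement gives. The sketch below is an assumption-laden illustration, not the computation behind Fig. 7: it assumes exponentially distributed failures, identical devices with a failure rate of one per year, and the textbook 2-of-3 success formula 3R^2 - 2R^3. The function names are hypothetical.

```python
import math


def single_success(rate: float, t: float) -> float:
    """Success probability of one device with constant failure rate."""
    return math.exp(-rate * t)


def tmr_success(rate: float, t: float) -> float:
    """Success probability of 2-of-3 voting over three identical devices."""
    r = single_success(rate, t)
    return 3.0 * r**2 - 2.0 * r**3


def crossover_time(rate: float) -> float:
    """Time at which 2-of-3 voting stops outperforming a single device.

    3R^2 - 2R^3 = R has the non-trivial root R = 0.5, so t = ln(2) / rate.
    """
    return math.log(2.0) / rate


if __name__ == "__main__":
    t_x = crossover_time(1.0)               # in years, for one failure per year
    print(f"cross-over at t = {t_x:.3f}")   # cross-over at t = 0.693
    print(f"maintenance multiplier = {3.0 / t_x:.2f}")
```

Under these assumptions both systems sit at a success probability of 0.5 at the cross-over, and three sensors serviced every 0.693 years give the maintenance multiplier of roughly 4⅓ quoted above.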

Fig. 8 compares four systems, 1/1, TMR, SNV, and HIQ, over a longer service-time of
three years. The probability of success curves show that systems with fault-mode
detection yield much better performance. After about 2½ years, SNV's probability of
success equals that of 1/1 and TMR at their cross-over point. Thus, they are
outperformed by twenty-two months. HIQ illustrates the impact of higher-level redundant
configurations with fault-detection. Although not shown on the graph, at 4¼ years
HIQ's probability of success is still above the value at the 1/1 and TMR cross-over point.

The  conclusions  of  this  presentation  are: 

• To effectively evaluate performance, catastrophic-risk and false-alarm modes of
Inputs and Outputs must be considered. This necessitates the use of a 3-state
analysis (Good, Bad, & Ugly) instead of the usual 2-state (Good & NoGood) one.

•  An  EPS,  using  a  DTT  mode  philosophy  with  an  intentionally  grounded  DC  power 
source,  and  fault-mode  detection,  will  achieve  the  highest  Overall-Improvement- 
Factor  and  False-Trip-Reduction  factor  and  the  lowest  Catastrophic-Risk-Factor. 

•  Of  all  logic  configurations  investigated,  the  HIQ  configuration  results  in  the 
highest  Overall-Improvement-Factor,  the  highest  False-Trip-Reduction  factor,  the 
lowest  Catastrophic-Risk-Factor,  and  the  highest  Benefit-to-Cost-Ratio. 

•  The  fallacy  of  TMR  claims  when  used  for  EPS,  is  that  they  do  not  include 
input/output  probability  of  success  over  an  anticipated  mission  or  service-time. 
When  these  parameters  are  included,  then  TMR  systems  will  show  decreased 
probability  of  success  factors,  and  a  reduced  Benefit-to-Cost-Ratio. 

•  EPS  can  be  configured  with  the  safer  Deenergize-to-Trip  logic  without  the  fear 
or  concern  about  power  dips  or  transients  causing  false-alarms. 

•  Case  studies  for  typical  fired-equipment  EPS  like  boilers  and  heaters,  also  show 
marked  improvements,  similar  to  those  of  turbo-machinery  EPS. 

• Mathematical models used to evaluate EPS designs should consider the following:
sensor type (e.g., pressure, flow, temperature, etc.); logic configuration (non-voting,
1/1, 1/2, etc.); overt-failure and covert-failure effects; control-power parameters
(AC, DC, grounded or floating); logic element selection (relay, solid-state,
software-programmable types); and output-action devices (starter, valve, breaker,
solenoid, AC, DC, grounded or floating).


APPENDIX:  3-State  Probability  of  Success  Analysis 


An  EPS  input  protective  device,  such  as  a  pressure  sensor,  has  one  working  state  and  two 
failure  states.  Its  two  fault-mode  states  can  be  described  as  "o"  for  overt  (unwarranted 
tripping)  or  "c"  for  covert  (unresponsive  to  demand).  EPS  elements  (devices)  are 
connected  in  parallel  or  in  series  in  order  to  implement  various  logic  configurations. 

A)  Parallel  Networks 

A parallel system comprised of active, independent, 3-state devices will fail if at least
one of its devices fails in the overt-mode, q_o, or if all of its devices fail in the
covert-mode, q_c. The system time-dependent Probability of Success, P_s(t), is given by

$$P_s(t) = \prod_{i=1}^{n} \left[1 - q_{oi}(t)\right] - \prod_{i=1}^{n} q_{ci}(t) \qquad (1)$$

where:

t is time.

n is the number of 3-state devices in parallel.

q_oi is the overt-mode probability of the ith device at time t.

q_ci is the covert-mode probability of the ith device at time t.

The system overt-mode probability, Q_so, is given by

$$Q_{so}(t) = 1 - \prod_{i=1}^{n} \left[1 - q_{oi}(t)\right] \qquad (2)$$

Similarly, the system covert-mode probability, Q_sc, is given by

$$Q_{sc}(t) = \prod_{i=1}^{n} q_{ci}(t) \qquad (3)$$

For the ith device overt-mode and covert-mode failure-rates, λ_oi and λ_ci, respectively,
and equating their sum to Λ_i, the device probabilities are q_oi(t) = (λ_oi/Λ_i)(1 - e^{-B})
and q_ci(t) = (λ_ci/Λ_i)(1 - e^{-B}). Substituting these into Eq. (1), the time-dependent
relationship is given by

$$P_s(t) = \prod_{i=1}^{n} \frac{1}{\Lambda_i}\left\{\lambda_{ci} + \lambda_{oi}\, e^{-B}\right\} - \prod_{i=1}^{n} \frac{\lambda_{ci}}{\Lambda_i}\left\{1 - e^{-B}\right\} \qquad (4)$$

where:

$$B = \Lambda_i t \qquad (5)$$


B) Series Networks

A series system is the reverse of the parallel one. It will fail if all of its elements
fail in the overt-mode or if any one element fails in the covert-mode. Then, by duality, its
system time-dependent Probability of Success, P_s(t), is given by

$$P_s(t) = \prod_{i=1}^{n} \frac{1}{\Lambda_i}\left\{\lambda_{oi} + \lambda_{ci}\, e^{-B}\right\} - \prod_{i=1}^{n} \frac{\lambda_{oi}}{\Lambda_i}\left\{1 - e^{-B}\right\} \qquad (6)$$
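
The parallel and series expressions can be evaluated numerically. The following is a minimal sketch, assuming constant overt and covert failure-rates for each device, with a failed device landing in each mode in proportion to its rate; under that assumption the per-device probabilities are (rate / total rate) × (1 - e^(-total rate × t)). The function names and the list-of-tuples input are hypothetical choices made for illustration.

```python
import math


def mode_probabilities(lam_o: float, lam_c: float, t: float):
    """Return (q_o, q_c) at time t for one device with constant failure-rates."""
    total = lam_o + lam_c
    if total == 0.0:
        return 0.0, 0.0
    failed = 1.0 - math.exp(-total * t)
    return lam_o / total * failed, lam_c / total * failed


def parallel_success(devices, t: float) -> float:
    """Parallel network: fails if any device is overt or all are covert."""
    no_overt, all_covert = 1.0, 1.0
    for lam_o, lam_c in devices:
        q_o, q_c = mode_probabilities(lam_o, lam_c, t)
        no_overt *= 1.0 - q_o
        all_covert *= q_c
    return no_overt - all_covert


def series_success(devices, t: float) -> float:
    """Series network (dual): fails if any device is covert or all are overt."""
    no_covert, all_overt = 1.0, 1.0
    for lam_o, lam_c in devices:
        q_o, q_c = mode_probabilities(lam_o, lam_c, t)
        no_covert *= 1.0 - q_c
        all_overt *= q_o
    return no_covert - all_overt
```

For a single device both functions reduce to e^(-Λt), and swapping the overt and covert rates turns the parallel result into the series result, which is the duality the Appendix relies on.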